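Deep reinforcement learning (DRL) is a promising approach to solving complex control tasks by learning policies through interaction with the environment. However, training DRL policies requires large amounts of experience, which makes it impractical to learn a policy directly on a physical system. Sim-to-real approaches use simulation to pretrain DRL policies and then deploy them in the real world. Unfortunately, direct real-world deployment of pretrained policies usually suffers from performance deterioration caused by differing dynamics, known as the reality gap. Recent sim-to-real methods, such as domain randomization and domain adaptation, focus on improving the robustness of pretrained agents. Nevertheless, simulation-trained policies often need to be tuned with real-world data to reach optimal performance, which is challenging because real-world samples are expensive. This work proposes a distributed cloud-edge architecture for training DRL agents in the real world in real time. In the architecture, inference and training are assigned to the edge and the cloud respectively, separating the real-time control loop from the computationally expensive training loop. To overcome the reality gap, our architecture exploits sim-to-real transfer strategies to continue training simulation-pretrained agents on a physical system. We demonstrate its applicability on a physical inverted-pendulum control system and analyze the critical parameters. Real-world experiments show that our architecture can adapt pretrained DRL agents to unseen dynamics consistently and efficiently.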
The analysis of network structure is essential to many scientific areas, ranging from biology to sociology. As the computational task of clustering these networks into partitions, i.e., solving the community detection problem, is generally NP-hard, heuristic solutions are indispensable. The exploration of expedient heuristics has led to the development of particularly promising approaches in the emerging technology of quantum computing. Motivated by the substantial hardware demands of all established quantum community detection approaches, we introduce a novel QUBO-based approach that needs only as many qubits as the graph has nodes and is represented by a QUBO matrix as sparse as the input graph's adjacency matrix. This substantial improvement in the sparsity of the QUBO matrix, which is typically very dense in related work, is achieved through the novel concept of separation nodes. Instead of assigning every node to a community directly, this approach relies on identifying a separation-node set whose removal from the graph yields a set of connected components, which represent the core components of the communities. A greedy heuristic then assigns the nodes from the separation-node set to the identified community cores, and the subsequent experimental results provide a proof of concept. This work hence presents a promising approach to NISQ-ready quantum community detection, catalyzing the application of quantum computers to the network structure analysis of large-scale, real-world problem instances.
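The classical post-processing described in this abstract can be illustrated with a minimal sketch. It assumes the separation-node set is already given (the QUBO that would find it is not reproduced here), and the greedy rule, attaching each separation node to the core that contains most of its neighbours, is an illustrative assumption rather than the authors' stated heuristic. The function names `community_cores` and `assign_separators` are hypothetical.

```python
from collections import deque


def community_cores(adj, separators):
    """Return the connected components left after removing the separation nodes."""
    seen = set(separators)  # separation nodes are never visited
    cores = []
    for start in sorted(adj):
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        seen.add(start)
        while queue:  # breadth-first search within one component
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    component.add(v)
                    queue.append(v)
        cores.append(component)
    return cores


def assign_separators(adj, cores, separators):
    """Greedy rule (assumed): join the community holding most of the node's neighbours.

    Requires at least one core; ties go to the earliest community.
    """
    communities = [set(c) for c in cores]
    for s in sorted(separators):
        counts = [sum(1 for v in adj[s] if v in c) for c in communities]
        best = max(range(len(communities)), key=counts.__getitem__)
        communities[best].add(s)
    return communities


# Two triangles bridged by node 3, which touches nodes 1 and 2 on one side and 4 on the other.
edges = [(0, 1), (1, 2), (0, 2), (1, 3), (2, 3), (3, 4), (4, 5), (5, 6), (4, 6)]
adj = {n: set() for n in range(7)}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

separators = {3}
cores = community_cores(adj, separators)
communities = assign_separators(adj, cores, separators)
print(cores)        # [{0, 1, 2}, {4, 5, 6}]
print(communities)  # [{0, 1, 2, 3}, {4, 5, 6}]
```

Because only the separation-node set has to be encoded in the optimization step, everything after it is cheap graph traversal on a classical machine.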
Wireless Sensor Network (WSN) applications are reshaping warehouse monitoring systems, allowing them to track and locate massive numbers of logistic entities in real time. In this setting, classic Radio Frequency (RF)-based localization approaches (e.g., triangulation and trilateration) are challenged by multi-path fading and signal loss in noisy warehouse environments. In this paper, we investigate machine learning methods on a new grid-based WSN platform called Sensor Floor that can overcome these issues. Sensor Floor consists of 345 nodes installed across the floor of our logistics research hall, each with dual-band RF and Inertial Measurement Unit (IMU) sensors. Our goal is to localize all logistic entities; for this study, we use a mobile robot. We record distributed sensing measurements of Received Signal Strength Indicator (RSSI) and IMU values as the dataset, and position tracking from a Vicon system as the ground truth. The asynchronously collected data is pre-processed and used to train a Random Forest and a Convolutional Neural Network (CNN). The CNN model with regularization outperforms the Random Forest, achieving a localization accuracy of approximately 15 cm. Moreover, the CNN architecture can be configured flexibly depending on the scenario in the warehouse. The hardware, software and CNN architecture of the Sensor Floor are open-source under https://github.com/FLW-TUDO/sensorfloor.
One of the major challenges in Deep Reinforcement Learning for control is the need for extensive training to learn the policy. Motivated by this, we present the design of the Control-Tutored Deep Q-Networks (CT-DQN) algorithm, a Deep Reinforcement Learning algorithm that leverages a control tutor, i.e., an exogenous control law, to reduce learning time. The tutor can be designed using an approximate model of the system, without any assumption about the knowledge of the system's dynamics. There is no expectation that it will be able to achieve the control objective if used stand-alone. During learning, the tutor occasionally suggests an action, thus partially guiding exploration. We validate our approach on three scenarios from OpenAI Gym: the inverted pendulum, lunar lander, and car racing. We demonstrate that CT-DQN is able to achieve better or equivalent data efficiency with respect to the classic function approximation solutions.
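The idea of a tutor that "occasionally suggests an action" can be sketched as a small change to epsilon-greedy action selection. This is a minimal illustration under stated assumptions: the switching rule (follow the tutor with a fixed probability `tutor_prob`) and the function name `select_action` are hypothetical, and the actual CT-DQN rule for when to consult the tutor may differ.

```python
import random


def select_action(q_values, tutor_action, tutor_prob, epsilon, rng=random):
    """Pick an action for one step of tutor-guided exploration.

    q_values     -- list of Q-value estimates, one per discrete action
    tutor_action -- action proposed by the control tutor (a control law
                    designed on an approximate model of the system)
    tutor_prob   -- assumed probability of following the tutor's suggestion
    epsilon      -- probability of a uniformly random exploratory action
    """
    if rng.random() < tutor_prob:
        return tutor_action                      # tutor guides exploration
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))      # ordinary random exploration
    return max(range(len(q_values)), key=q_values.__getitem__)  # greedy action


rng = random.Random(0)
q = [0.1, 0.7, 0.2]
always_tutor = select_action(q, tutor_action=2, tutor_prob=1.0, epsilon=0.0, rng=rng)
pure_greedy = select_action(q, tutor_action=2, tutor_prob=0.0, epsilon=0.0, rng=rng)
print(always_tutor, pure_greedy)  # 2 1
```

With `tutor_prob` set to zero the rule reduces to plain epsilon-greedy DQN exploration, so the tutor only biases which transitions are collected and leaves the Q-learning update unchanged.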
To enable safe and effective human-robot cooperation, it is crucial to develop models for the identification of human activities. Egocentric vision seems to be a viable solution to this problem, and therefore many works provide deep learning solutions to infer human actions from first-person videos. However, although very promising, most of these do not consider the major challenges that come with a realistic deployment, such as the portability of the model, the need for real-time inference, and robustness to novel domains (i.e., new spaces, users, tasks). With this paper, we set the boundaries that egocentric vision models should consider for realistic applications, defining a novel setting of egocentric action recognition in the wild, which encourages researchers to develop novel, application-aware solutions. We also present a new model-agnostic technique that enables the rapid repurposing of existing architectures in this new context, demonstrating the feasibility of deploying a model on a tiny device (Jetson Nano) and of performing the task directly on the edge with very low energy consumption (2.4W on average at 50 fps).
The Transformer architecture is shown to provide a powerful framework as an end-to-end model for building expression trees from online handwritten gestures corresponding to glyph strokes. In particular, the attention mechanism was successfully used to encode, learn and enforce the underlying syntax of expressions creating latent representations that are correctly decoded to the exact mathematical expression tree, providing robustness to ablated inputs and unseen glyphs. For the first time, the encoder is fed with spatio-temporal data tokens potentially forming an infinitely large vocabulary, which finds applications beyond that of online gesture recognition. A new supervised dataset of online handwriting gestures is provided for training models on generic handwriting recognition tasks and a new metric is proposed for the evaluation of the syntactic correctness of the output expression trees. A small Transformer model suitable for edge inference was successfully trained to an average normalised Levenshtein accuracy of 94%, resulting in valid postfix RPN tree representation for 94% of predictions.
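A normalised Levenshtein accuracy over token sequences can be sketched as follows. The abstract does not give the exact normalisation, so this sketch assumes accuracy = 1 - distance / max(len(prediction), len(target)), applied to postfix (RPN) token lists; the function names are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance between two token sequences (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, token_a in enumerate(a, start=1):
        current = [i]
        for j, token_b in enumerate(b, start=1):
            cost = 0 if token_a == token_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def normalised_accuracy(prediction, target):
    """Assumed normalisation: 1 - distance / length of the longer sequence."""
    longest = max(len(prediction), len(target))
    if longest == 0:
        return 1.0  # two empty sequences are identical
    return 1.0 - levenshtein(prediction, target) / longest


# Postfix (RPN) tokens for x^2 + y, with one wrong symbol in the prediction.
target = ["x", "2", "^", "y", "+"]
prediction = ["x", "2", "^", "z", "+"]
distance = levenshtein(prediction, target)
accuracy = normalised_accuracy(prediction, target)
print(distance, accuracy)
```

Note that a high token-level accuracy does not by itself guarantee a syntactically valid RPN expression, which is why the abstract reports validity of the decoded tree as a separate figure.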
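Network flow problems, which involve distributing traffic over a network so that the underlying infrastructure is used effectively, are ubiquitous in transportation and logistics. Owing to the appeal of data-driven optimization, these problems have increasingly been approached with graph learning methods. Among them, the Multi-Commodity Network Flow (MCNF) problem is of particular interest given its generality, since it concerns the distribution of multiple flows (also called demands) of different sizes between several sources and sinks. The widely used objective that we focus on is the maximum utilization of any link in the network, given the traffic demands and a routing strategy. In this paper, we propose a novel approach to the MCNF problem based on Graph Neural Networks (GNNs), which uses distinctly parametrized message functions along each link, akin to a relational model in which all edge types are unique. We show that our proposed method yields substantial gains over existing graph learning methods that constrain the routing unnecessarily. We extensively evaluate the proposed approach through an Internet routing case study using 17 service provider topologies and two flow routing schemes. We find that, in many networks, an MLP is competitive with a generic GNN that does not use our mechanism. Furthermore, we shed light on the relationship between graph structure and the difficulty of data-driven flow routing, an aspect that has not been considered in existing work in the area.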
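The maximum link utilization objective can be made concrete with a minimal sketch. It assumes each demand is routed along a single fixed path and each directed link has a known capacity; the function name `max_link_utilization` and the toy topology are hypothetical and are not taken from the paper's case study.

```python
def max_link_utilization(capacity, routed_demands):
    """Largest load-to-capacity ratio over all links.

    capacity       -- dict mapping a directed link (u, v) to its capacity
    routed_demands -- list of (path, volume) pairs, where path is a node list
                      (single-path routing is an assumption of this sketch)
    """
    load = {link: 0.0 for link in capacity}
    for path, volume in routed_demands:
        for link in zip(path, path[1:]):  # consecutive node pairs on the path
            load[link] += volume
    return max(load[link] / capacity[link] for link in capacity)


capacity = {("a", "b"): 10.0, ("b", "c"): 5.0, ("a", "c"): 10.0}
routed_demands = [
    (["a", "b", "c"], 4.0),  # demand a -> c sent via b
    (["a", "c"], 5.0),       # demand a -> c sent directly
    (["b", "c"], 1.0),       # demand b -> c
]
mlu = max_link_utilization(capacity, routed_demands)
print(mlu)  # link (b, c) carries 5.0 of capacity 5.0
```

A routing strategy is better under this objective when it lowers the ratio on the most congested link, for instance by moving part of the a -> c traffic off the path through b.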
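In this report, we describe the technical details of our submission to the EPIC-Kitchens-100 Unsupervised Domain Adaptation (UDA) Challenge. To tackle the domain shift that exists in the UDA setting, we first exploit a recent Domain Generalization (DG) technique called Relative Norm Alignment (RNA). Second, we extend this approach to work on unlabeled target data, which makes it simpler for the model to adapt to the target distribution in an unsupervised fashion. To this end, we include UDA algorithms such as multi-level adversarial alignment and attentive entropy in our framework. By analyzing the challenge setting, we notice the presence of a secondary, concurrent shift in the data, usually called environmental bias. It is caused by the existence of different environments, i.e., kitchens. To deal with these two shifts (environmental and temporal), we extend our system to perform multi-source multi-target domain adaptation. Finally, we employ different models in our final proposal to leverage the potential of popular video architectures, and we introduce two losses for ensemble adaptation. Our submission (entry "PLNET") is visible on the leaderboard and ranks 2nd for "verb" and 3rd for both "noun" and "action".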
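End-to-end speech synthesis models directly convert the input characters into an audio representation (e.g., spectrograms). Despite their impressive performance, such models have difficulty disambiguating the pronunciations of identically spelled words. To mitigate this issue, a separate Grapheme-to-Phoneme (G2P) model can be used to convert the characters into phonemes before the audio is synthesized. This paper proposes SoundChoice, a novel G2P architecture that processes entire sentences rather than operating at the word level. The proposed architecture takes advantage of a weighted homograph loss (which improves disambiguation), exploits curriculum learning (which gradually switches from word-level to sentence-level G2P), and integrates word embeddings from BERT (for a further performance improvement). Moreover, the model inherits best practices from speech recognition, including multi-task learning with Connectionist Temporal Classification (CTC) and beam search with an embedded language model. As a result, SoundChoice achieves a Phoneme Error Rate (PER) of 2.65% on whole-sentence transcription using data from LibriSpeech and Wikipedia. Index Terms: grapheme-to-phoneme, speech synthesis, text-to-speech, phonetics, pronunciation, disambiguation.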
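Transformers have recently achieved state-of-the-art performance in speech separation. These models, however, are computationally demanding and require a large number of learnable parameters. This paper explores Transformer-based speech separation at a reduced computational cost. Our main contribution is the development of the Resource-Efficient Separation Transformer (RE-SepFormer), a self-attention-based architecture that reduces the computational burden in two ways. First, it uses non-overlapping blocks in the latent space. Second, it operates on compact latent summaries calculated from each chunk. The RE-SepFormer reaches competitive performance on the popular WSJ0-2Mix and WHAM! datasets in both causal and non-causal settings. Remarkably, it scales significantly better than previous Transformer-based and RNN-based architectures in terms of memory and inference time, which makes it more suitable for processing long mixtures.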
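The two cost-saving ideas, non-overlapping chunks and one compact summary per chunk, can be sketched in a few lines. This is an illustration under stated assumptions: the summary is taken to be the mean of the frames in a chunk, and the function name `chunk_summaries` is hypothetical; the learned layers of the actual model are not reproduced.

```python
def chunk_summaries(frames, chunk_size):
    """Split latent frames into non-overlapping chunks and summarize each one.

    frames     -- list of latent vectors (lists of floats), one per time step
    chunk_size -- number of frames per chunk; the last chunk may be shorter
    Returns (chunks, summaries), where each summary is the per-dimension mean
    of its chunk (mean pooling is an assumption of this sketch).
    """
    chunks = [frames[i:i + chunk_size] for i in range(0, len(frames), chunk_size)]
    summaries = [
        [sum(column) / len(chunk) for column in zip(*chunk)]
        for chunk in chunks
    ]
    return chunks, summaries


frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]
chunks, summaries = chunk_summaries(frames, chunk_size=2)
print(len(chunks))  # 3
print(summaries)    # [[2.0, 3.0], [6.0, 7.0], [9.0, 10.0]]
```

Because the chunks do not overlap, each frame is processed once within its chunk, and any attention across chunks runs over one summary per chunk rather than over every frame, which is where the savings in memory and inference time on long inputs would come from.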